04. Quiz: TensorFlow Linear Function
Linear functions in TensorFlow
The most common operation in neural networks is calculating the linear combination of inputs, weights, and biases. As a reminder, we can write the output of the linear operation as
Here, \mathbf{W} is a matrix of the weights connecting two layers. The output \mathbf{y}, the input \mathbf{x}, and the biases \mathbf{b} are all vectors.
Weights and Bias in TensorFlow
The goal of training a neural network is to modify weights and biases to best predict the labels. In order to use weights and bias, you'll need a Tensor that can be modified. This leaves out tf.placeholder()
and tf.constant()
, since those Tensors can't be modified. This is where tf.Variable
class comes in.
tf.Variable()
x = tf.Variable(5)
The tf.Variable
class creates a tensor with an initial value that can be modified, much like a normal Python variable. This tensor stores its state in the session, so you must initialize the state of the tensor manually. You'll use the tf.global_variables_initializer()
function to initialize the state of all the Variable tensors.
Initialization
init = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init)
The tf.global_variables_initializer()
call returns an operation that will initialize all TensorFlow variables from the graph. You call the operation using a session to initialize all the variables as shown above. Using the tf.Variable
class allows us to change the weights and bias, but an initial value needs to be chosen.
Initializing the weights with random numbers from a normal distribution is good practice. Randomizing the weights helps the model from becoming stuck in the same place every time you train it. You'll learn more about this in the next lesson, when you study gradient descent.
Similarly, choosing weights from a normal distribution prevents any one weight from overwhelming other weights. You'll use the tf.truncated_normal()
function to generate random numbers from a normal distribution.
tf.truncated_normal()
n_features = 120
n_labels = 5
weights = tf.Variable(tf.truncated_normal((n_features, n_labels)))
The tf.truncated_normal()
function returns a tensor with random values from a normal distribution whose magnitude is no more than 2 standard deviations from the mean.
Since the weights are already helping prevent the model from getting stuck, you don't need to randomize the bias. Let's use the simplest solution, setting the bias to 0.
tf.zeros()
n_labels = 5
bias = tf.Variable(tf.zeros(n_labels))
The tf.zeros()
function returns a tensor with all zeros.
Linear Classifier Quiz
You'll be classifying the handwritten numbers 0
, 1
, and 2
from the MNIST dataset using TensorFlow. The above is a small sample of the data you'll be training on. Notice how some of the 1
s are written with a serif at the top and at different angles. The similarities and differences will play a part in shaping the weights of the model.
The images above are trained weights for each label (0
, 1
, and 2
). The weights display the unique properties of each digit they have found. Complete this quiz to train your own weights using the MNIST dataset.
Instructions
- Open quiz.py.
- Implement
get_weights
to return atf.Variable
of weights - Implement
get_biases
to return atf.Variable
of biases - Implement
xW + b
in thelinear
function
- Implement
- Open sandbox.py
- Initialize all weights
Since xW
in xW + b
is matrix multiplication, you have to use the tf.matmul()
function instead of tf.multiply()
. Don't forget that order matters in matrix multiplication, so tf.matmul(a,b)
is not the same as tf.matmul(b,a)
.
Start Quiz:
# Solution is available in the other "sandbox_solution.py" tab
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from quiz import get_weights, get_biases, linear
def mnist_features_labels(n_labels):
"""
Gets the first <n> labels from the MNIST dataset
:param n_labels: Number of labels to use
:return: Tuple of feature list and label list
"""
mnist_features = []
mnist_labels = []
mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)
# In order to make quizzes run faster, we're only looking at 10000 images
for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
# Add features and labels if it's for the first <n>th labels
if mnist_label[:n_labels].any():
mnist_features.append(mnist_feature)
mnist_labels.append(mnist_label[:n_labels])
return mnist_features, mnist_labels
# Number of features (28*28 image is 784 features)
n_features = 784
# Number of labels
n_labels = 3
# Features and Labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)
# Weights and Biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)
# Linear Function xW + b
logits = linear(features, w, b)
# Training data
train_features, train_labels = mnist_features_labels(n_labels)
with tf.Session() as session:
# TODO: Initialize session variables
# Softmax
prediction = tf.nn.softmax(logits)
# Cross entropy
# This quantifies how far off the predictions were.
# You'll learn more about this in future lessons.
cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)
# Training loss
# You'll learn more about this in future lessons.
loss = tf.reduce_mean(cross_entropy)
# Rate at which the weights are changed
# You'll learn more about this in future lessons.
learning_rate = 0.08
# Gradient Descent
# This is the method used to train the model
# You'll learn more about this in future lessons.
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Run optimizer and get loss
_, l = session.run(
[optimizer, loss],
feed_dict={features: train_features, labels: train_labels})
# Print loss
print('Loss: {}'.format(l))
# Solution is available in the other "quiz_solution.py" tab
import tensorflow as tf
def get_weights(n_features, n_labels):
"""
Return TensorFlow weights
:param n_features: Number of features
:param n_labels: Number of labels
:return: TensorFlow weights
"""
# TODO: Return weights
pass
def get_biases(n_labels):
"""
Return TensorFlow bias
:param n_labels: Number of labels
:return: TensorFlow bias
"""
# TODO: Return biases
pass
def linear(input, w, b):
"""
Return linear function in TensorFlow
:param input: TensorFlow input
:param w: TensorFlow weights
:param b: TensorFlow biases
:return: TensorFlow linear function
"""
# TODO: Linear Function (xW + b)
pass
# Quiz Solution
# Note: You can't run code in this tab
import tensorflow as tf
def get_weights(n_features, n_labels):
"""
Return TensorFlow weights
:param n_features: Number of features
:param n_labels: Number of labels
:return: TensorFlow weights
"""
# TODO: Return weights
return tf.Variable(tf.truncated_normal((n_features, n_labels)))
def get_biases(n_labels):
"""
Return TensorFlow bias
:param n_labels: Number of labels
:return: TensorFlow bias
"""
# TODO: Return biases
return tf.Variable(tf.zeros(n_labels))
def linear(input, w, b):
"""
Return linear function in TensorFlow
:param input: TensorFlow input
:param w: TensorFlow weights
:param b: TensorFlow biases
:return: TensorFlow linear function
"""
# TODO: Linear Function (xW + b)
return tf.add(tf.matmul(input, w), b)
import tensorflow as tf
# Sandbox Solution
# Note: You can't run code in this tab
from tensorflow.examples.tutorials.mnist import input_data
from quiz import get_weights, get_biases, linear
def mnist_features_labels(n_labels):
"""
Gets the first <n> labels from the MNIST dataset
:param n_labels: Number of labels to use
:return: Tuple of feature list and label list
"""
mnist_features = []
mnist_labels = []
mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)
# In order to make quizzes run faster, we're only looking at 10000 images
for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
# Add features and labels if it's for the first <n>th labels
if mnist_label[:n_labels].any():
mnist_features.append(mnist_feature)
mnist_labels.append(mnist_label[:n_labels])
return mnist_features, mnist_labels
# Number of features (28*28 image is 784 features)
n_features = 784
# Number of labels
n_labels = 3
# Features and Labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)
# Weights and Biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)
# Linear Function xW + b
logits = linear(features, w, b)
# Training data
train_features, train_labels = mnist_features_labels(n_labels)
with tf.Session() as session:
session.run(tf.global_variables_initializer())
# Softmax
prediction = tf.nn.softmax(logits)
# Cross entropy
# This quantifies how far off the predictions were.
# You'll learn more about this in future lessons.
cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)
# Training loss
# You'll learn more about this in future lessons.
loss = tf.reduce_mean(cross_entropy)
# Rate at which the weights are changed
# You'll learn more about this in future lessons.
learning_rate = 0.08
# Gradient Descent
# This is the method used to train the model
# You'll learn more about this in future lessons.
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Run optimizer and get loss
_, l = session.run(
[optimizer, loss],
feed_dict={features: train_features, labels: train_labels})
# Print loss
print('Loss: {}'.format(l))
User's Answer:
(Note: The answer done by the user is not guaranteed to be correct)
# Solution is available in the other "sandbox_solution.py" tab
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from quiz import get_weights, get_biases, linear
def mnist_features_labels(n_labels):
"""
Gets the first <n> labels from the MNIST dataset
:param n_labels: Number of labels to use
:return: Tuple of feature list and label list
"""
mnist_features = []
mnist_labels = []
mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)
# In order to make quizzes run faster, we're only looking at 10000 images
for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
# Add features and labels if it's for the first <n>th labels
if mnist_label[:n_labels].any():
mnist_features.append(mnist_feature)
mnist_labels.append(mnist_label[:n_labels])
return mnist_features, mnist_labels
# Number of features (28*28 image is 784 features)
n_features = 784
# Number of labels
n_labels = 3
# Features and Labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)
# Weights and Biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)
# Linear Function xW + b
logits = linear(features, w, b)
# Training data
train_features, train_labels = mnist_features_labels(n_labels)
with tf.Session() as session:
# TODO: Initialize session variables
session.run(tf.global_variables_initializer())
# Softmax
prediction = tf.nn.softmax(logits)
# Cross entropy
# This quantifies how far off the predictions were.
# You'll learn more about this in future lessons.
cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)
# Training loss
# You'll learn more about this in future lessons.
loss = tf.reduce_mean(cross_entropy)
# Rate at which the weights are changed
# You'll learn more about this in future lessons.
learning_rate = 0.08
# Gradient Descent
# This is the method used to train the model
# You'll learn more about this in future lessons.
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Run optimizer and get loss
_, l = session.run(
[optimizer, loss],
feed_dict={features: train_features, labels: train_labels})
# Print loss
print('Loss: {}'.format(l))
# Solution is available in the other "quiz_solution.py" tab
import tensorflow as tf
def get_weights(n_features, n_labels):
"""
Return TensorFlow weights
:param n_features: Number of features
:param n_labels: Number of labels
:return: TensorFlow weights
"""
# TODO: Return weights
return tf.Variable(tf.truncated_normal((n_features, n_labels)))
def get_biases(n_labels):
"""
Return TensorFlow bias
:param n_labels: Number of labels
:return: TensorFlow bias
"""
# TODO: Return biases
return tf.Variable(tf.zeros(n_labels))
def linear(input, w, b):
"""
Return linear function in TensorFlow
:param input: TensorFlow input
:param w: TensorFlow weights
:param b: TensorFlow biases
:return: TensorFlow linear function
"""
# TODO: Linear Function (xW + b)
return tf.add(tf.matmul(input,w),b)
# Quiz Solution
# Note: You can't run code in this tab
import tensorflow as tf
def get_weights(n_features, n_labels):
"""
Return TensorFlow weights
:param n_features: Number of features
:param n_labels: Number of labels
:return: TensorFlow weights
"""
# TODO: Return weights
return tf.Variable(tf.truncated_normal((n_features, n_labels)))
def get_biases(n_labels):
"""
Return TensorFlow bias
:param n_labels: Number of labels
:return: TensorFlow bias
"""
# TODO: Return biases
return tf.Variable(tf.zeros(n_labels))
def linear(input, w, b):
"""
Return linear function in TensorFlow
:param input: TensorFlow input
:param w: TensorFlow weights
:param b: TensorFlow biases
:return: TensorFlow linear function
"""
# TODO: Linear Function (xW + b)
return tf.add(tf.matmul(input, w), b)
import tensorflow as tf
# Sandbox Solution
# Note: You can't run code in this tab
from tensorflow.examples.tutorials.mnist import input_data
from quiz import get_weights, get_biases, linear
def mnist_features_labels(n_labels):
"""
Gets the first <n> labels from the MNIST dataset
:param n_labels: Number of labels to use
:return: Tuple of feature list and label list
"""
mnist_features = []
mnist_labels = []
mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)
# In order to make quizzes run faster, we're only looking at 10000 images
for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
# Add features and labels if it's for the first <n>th labels
if mnist_label[:n_labels].any():
mnist_features.append(mnist_feature)
mnist_labels.append(mnist_label[:n_labels])
return mnist_features, mnist_labels
# Number of features (28*28 image is 784 features)
n_features = 784
# Number of labels
n_labels = 3
# Features and Labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)
# Weights and Biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)
# Linear Function xW + b
logits = linear(features, w, b)
# Training data
train_features, train_labels = mnist_features_labels(n_labels)
with tf.Session() as session:
session.run(tf.global_variables_initializer())
# Softmax
prediction = tf.nn.softmax(logits)
# Cross entropy
# This quantifies how far off the predictions were.
# You'll learn more about this in future lessons.
cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)
# Training loss
# You'll learn more about this in future lessons.
loss = tf.reduce_mean(cross_entropy)
# Rate at which the weights are changed
# You'll learn more about this in future lessons.
learning_rate = 0.08
# Gradient Descent
# This is the method used to train the model
# You'll learn more about this in future lessons.
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
# Run optimizer and get loss
_, l = session.run(
[optimizer, loss],
feed_dict={features: train_features, labels: train_labels})
# Print loss
print('Loss: {}'.format(l))